Disambiguation of patent inventors and assignees using high-resolution geolocation data

Authors

  • Greg Morrison
  • Massimo Riccaboni
  • Fabio Pammolli
Abstract

Patent data represent a significant source of information on innovation, knowledge production, and the evolution of technology through networks of citations, co-invention and co-assignment. A major obstacle to extracting useful information from these data is the problem of name disambiguation: linking alternate spellings of individuals or institutions to a single identifier to uniquely determine the parties involved in knowledge production and diffusion. In this paper, we describe a new algorithm that uses high-resolution geolocation to disambiguate both inventors and assignees on about 8.5 million patents found in the European Patent Office (EPO), under the Patent Cooperation Treaty (PCT), and in the US Patent and Trademark Office (USPTO). We show that this disambiguation is consistent with a number of ground-truth benchmarks for both assignees and inventors, and that it significantly outperforms the use of undisambiguated names to identify unique entities. A significant benefit of this work is a high-quality assignee disambiguation with worldwide coverage, coupled with an inventor disambiguation (competitive with other state-of-the-art approaches) across multiple patent offices.
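
The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea of combining name similarity with geographic proximity; it is not the authors' method. The function names, the thresholds (`min_name_sim`, `max_km`), the use of `difflib` string similarity, and the sample records are all assumptions made for illustration.

```python
import math
from difflib import SequenceMatcher


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def normalize(name):
    """Lowercase the name and drop dots, commas, and extra whitespace."""
    return " ".join(name.lower().replace(".", " ").replace(",", " ").split())


def disambiguate(records, min_name_sim=0.85, max_km=20.0):
    """Cluster (name, lat, lon) records; returns one cluster id per record.

    Two records are linked when their normalized names are similar AND
    their locations are close. Both thresholds are hypothetical values.
    """
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            n1, la1, lo1 = records[i]
            n2, la2, lo2 = records[j]
            sim = SequenceMatcher(None, normalize(n1), normalize(n2)).ratio()
            if sim >= min_name_sim and haversine_km(la1, lo1, la2, lo2) <= max_km:
                parent[find(j)] = find(i)  # merge the two clusters
    return [find(i) for i in range(len(records))]


# Made-up records: two spellings at nearby coordinates, a homonym far away,
# and an unrelated name at the first location.
records = [
    ("Smith, John", 43.84, 10.50),
    ("SMITH JOHN.", 43.85, 10.51),
    ("Smith, John", 40.71, -74.00),
    ("Rossi, Maria", 43.84, 10.50),
]
ids = disambiguate(records)
```

In this toy example the first two records share a cluster id, while the distant homonym and the unrelated name each keep their own. The pairwise loop is quadratic, so a corpus of millions of patents would need some form of blocking (for example by location cell) before any comparison of this kind.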

Similar resources

Technological Trends Analysis of Fuel Cell Electric Vehicle using Patent Information

Patent citation information is created based on the applications submitted by the inventors and a search report presented by examiners from the patent organization. This information contains the difference between rarely cited and frequently cited patents. For the patents registered in the U.S.A., Europe, and Japan, the references are presented with the application when created and examined. The...

Extracting the significant-rare keywords for patent analysis

Brainstorming for keywords is used in retrieving patent documents, but even experienced engineers are irresolute in dealing with this critical issue. The quality of a patent report is usually already determined by the keywords chosen in the first step. To overcome this stumbling block, this paper demonstrates a new method for finding the significant-rare keywords in a patent database. The re...

Development of a Patent Matching System Using a Hybrid Approach

There has been much research on applying various data mining or text mining tools to patent analysis, and many scholars and experts have verified the accuracy and feasibility of those tools. However, since mining tools always try to analyze the content using some mathematical methodology, such as linguistic algorithms, they neglect the fact that patent records are combinations o...

Finding small molecules for the ‘next Ebola’

The current Ebola virus epidemic may provide some suggestions of how we can better prepare for the next pathogen outbreak. We propose several cost effective steps that could be taken that would impact the discovery and use of small molecule therapeutics including: 1. text mine the literature, 2. patent assignees and/or inventors should openly declare their relevant filings, 3. reagents and assa...

How To Kill Inventors: Testing The Massacrator

Inventor disambiguation is an increasingly important issue for users of patent data. We propose and test a number of refinements to the Massacrator algorithm, originally proposed by Lissoni et al. (2006) and now applied to APE-INV, a free access database funded by the European Science Foundation. Following Raffo and Lhuillery (2009) we describe disambiguation as a 3-step process: cleaning&parsi...

Journal title:

Volume 4, issue

Pages -

Publication date: 2017